Partial Change Phone Models for Pronunciation Variations in Spontaneous Mandarin Speech
Abstract
Modeling pronunciation variations is a critical part of spontaneous Mandarin speech recognition. Such variations include both complete changes and partial changes. Complete changes can usually be modeled by using an alternative phone to replace the canonical phoneme. Partial changes are variations within the phoneme, such as those marked by diacritics, and cannot be modeled by conventional methods. In this paper, we propose using partial change phone models to represent such changes. The pre-trained acoustic model is reconstructed by sharing Gaussian mixtures between canonical phone models and partial change phone models at the state level. In this way, we improve the resolution of the acoustic model so that it can accommodate partial changes. The effectiveness of this approach is evaluated on the Hub4NE Mandarin Broadcast News Corpus. Syllable accuracy increased by 2.59% absolute relative to the baseline.
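The state-level sharing of Gaussian mixtures described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how shared components are weighted, so the names `Gaussian`, `StateGMM`, `share_mixtures` and the interpolation parameter `partial_mass` are assumptions introduced here for illustration only. The sketch pools the Gaussian components of one canonical-phone HMM state with those of the matching partial change phone state, so that the reconstructed state also scores frames realized with a partial change.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gaussian:
    """Diagonal-covariance Gaussian over an acoustic feature vector."""
    mean: Tuple[float, ...]
    var: Tuple[float, ...]

    def log_pdf(self, x: Tuple[float, ...]) -> float:
        # Sum of independent one-dimensional Gaussian log densities.
        return sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


@dataclass
class StateGMM:
    """Output distribution of a single HMM state (hypothetical container)."""
    weights: List[float]
    components: List[Gaussian]

    def log_likelihood(self, x: Tuple[float, ...]) -> float:
        # Log-sum-exp over the weighted components for numerical stability.
        terms = [
            math.log(w) + g.log_pdf(x)
            for w, g in zip(self.weights, self.components)
            if w > 0.0
        ]
        peak = max(terms)
        return peak + math.log(sum(math.exp(t - peak) for t in terms))


def share_mixtures(canonical: StateGMM, partial: StateGMM,
                   partial_mass: float = 0.3) -> StateGMM:
    """Rebuild a canonical state so it also uses the partial-change Gaussians.

    Assumption: the two weight sets are linearly interpolated, with
    `partial_mass` of the probability mass given to the partial-change
    components. The Gaussian objects are reused, not copied, so they
    remain shared between the two models.
    """
    if not 0.0 <= partial_mass < 1.0:
        raise ValueError("partial_mass must lie in [0, 1)")
    weights = [w * (1.0 - partial_mass) for w in canonical.weights]
    weights += [w * partial_mass for w in partial.weights]
    return StateGMM(weights, canonical.components + partial.components)


# Toy one-dimensional example: the partial-change variant sits at 3.0.
canonical_state = StateGMM([1.0], [Gaussian((0.0,), (1.0,))])
partial_state = StateGMM([1.0], [Gaussian((3.0,), (1.0,))])
merged_state = share_mixtures(canonical_state, partial_state)
```

In this toy setup, a frame at 3.0 receives a higher log-likelihood from `merged_state` than from `canonical_state` alone, which is the sense in which pooling components raises the resolution of the state. In a real system the weights would be re-estimated from data rather than fixed by hand.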
Related papers
Pronunciation Modeling for Spontaneous Mandarin Speech Recognition
Pronunciation variations in spontaneous speech can be classified into complete changes and partial changes. A complete change is the replacement of a canonical phoneme by an alternative phone, such as ‘b’ being pronounced as ‘p’. Partial changes are variations within the phoneme, such as nasalization, centralization and voicing. Most current work in pronunciation modeling for spontaneous Man...
Model Partial Pronunciation Var Mandarin Speech Re
Modeling pronunciation variations is a critical part of spontaneous Mandarin speech recognition. Such variations include both complete changes and partial changes. Complete changes can usually be modeled by using an alternate phone to replace the canonical phone. Partial changes, which cannot be modeled by conventional methods, are variations within the phoneme and include diacritics. In this pa...
Model partial pronunciation variations for spontaneous Mandarin speech recognition
The high error rate in spontaneous speech recognition is due in part to the poor modeling of pronunciation variations. An analysis of acoustic data reveals that pronunciation variations include both complete changes and partial changes. Complete changes are the replacement of a canonical phoneme by another alternative phone, such as ‘b’ being pronounced as ‘p’. Partial changes are the variations w...
Triphone model reconstruction for Mandarin pronunciation variations
The high recognition error rate in spontaneous speech is due in part to the poor modeling of pronunciations. In this paper, we propose modeling pronunciation variations through triphone model reconstruction. We first generate partial change phone models (PCPMs) to differentiate pronunciation variations. In order to improve the resolution of triphone models, PCPM is used as a hidden mo...
Modelling pronunciation variations in spontaneous Mandarin speech
Pronunciation in spontaneous Mandarin speech tends to be much more variable than in read speech. In current recognition systems, pronunciation dictionaries usually only contain one standard pronunciation for each word, so that the amount of variability that can be modelled is very limited. Most recent research work for modelling variations in spontaneous speech focuses on the lexicon level, whi...
Publication date: 2002